Scalable register bypassing for FPGA-based processors
Abstract
This paper presents a scalable scheme for full register bypassing in a modern embedded processor architecture termed ByoRISC; the scheme is configurable via register-transfer level parameters. The register bypassing specification is parameterized by the number of homogeneous register file read and write ports and by the number of pipeline stages of the processor. The performance characteristics (cycle time, chip area) of the proposed technique have been evaluated for FPGA target implementations of the synthesizable ByoRISC model. The results show that a full bypassing network is a viable solution for eliminating data hazards when servicing instructions with multiple read and write operands. While the maximum clock frequency is reduced by 17.9% on average when using partial versus full forwarding, the positive effect of custom computation outweighs this reduction by providing cycle speedups of 3.9× to 5.5× and corresponding execution time speedups of 3.6× for a ByoRISC testbed processor. Individual application speedups of up to 9.4× have also been obtained.
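To illustrate what a bypass network parameterized by read ports, write ports, and pipeline stages has to do, here is a minimal behavioral sketch in Python. It is not the ByoRISC RTL: the names `WriteBack`, `read_operands`, and `comparator_count` are hypothetical, and the sketch assumes that every in-flight result listed in a stage is already computed, so no stall logic is modeled.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class WriteBack:
    """One pending result on a write port of an in-flight instruction (hypothetical model)."""
    dest: int   # destination register index
    value: int  # result value, assumed to be already available


def read_operands(srcs: Sequence[int],
                  regfile: Sequence[int],
                  stages: Sequence[Sequence[WriteBack]]) -> List[int]:
    """Resolve one operand per read port with full forwarding.

    `stages` is ordered youngest-first: stages[0] holds the results of the
    most recently issued instruction, so the first match is the newest value.
    """
    operands = []
    for src in srcs:                      # one lookup per read port
        value = regfile[src]              # default: value from the register file
        found = False
        for stage in stages:              # youngest in-flight instruction first
            for wb in stage:              # one entry per write port
                if wb.dest == src:
                    value = wb.value      # forward the pending result
                    found = True
                    break
            if found:
                break
        operands.append(value)
    return operands


def comparator_count(read_ports: int, write_ports: int, bypass_stages: int) -> int:
    """Index comparisons this sketch performs in the worst case, per cycle."""
    return read_ports * write_ports * bypass_stages


regfile = [0, 10, 20, 30]
stages = [
    [WriteBack(1, 111), WriteBack(2, 222)],  # youngest instruction, two write ports
    [WriteBack(1, 999)],                     # older instruction, overwritten by the younger one
]
print(read_operands([1, 2, 3], regfile, stages))
print(comparator_count(3, 2, 2))
```

In this sketch the number of index comparisons grows with the product of read ports, write ports, and bypassed stages, which gives a rough feel for why the cost of full forwarding depends on exactly the parameters the abstract names.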
Similar resources
Use of compiler optimization of software bypassing as a method to improve energy efficiency of exposed data path architectures
In the design of embedded systems, hardware and software need to be co-explored together to meet targets of performance and energy. With the use of application-specific instruction-set processors, as a stand-alone solution or as a part of a system on chip, the customization of processors for a particular application is a known method to reduce energy requirements and provide performance. In par...
Scalable MPEG-4 Encoder on FPGA Multiprocessor SOC
High computational requirements combined with rapidly evolving video coding algorithms and standards are a great challenge for contemporary encoder implementations. Rapid specification changes prefer full programmability and configurability both for software and hardware. This paper presents a novel scalable MPEG-4 video encoder on an FPGA-based multiprocessor system-on-chip (MPSOC). The MPSOC a...
Model-Based FPGA Embedded-Processor Systems Design Methodologies: Modeling, Syntheses, Implementation and Validation
The evolution of field programmable gate arrays (FPGAs) as custom-computing machines for digital signal processing (DSP), real-time embedded and reconfigurable systems development, embedded processors, and as co-processors for application specific integrated circuit (ASIC) prototyping has led to the emergence of several modeling and design methodologies among which are the register transfer lev...
Architecture and Performance of the Hitachi SR2201 Massively Parallel Processor System
RISC-based Massively Parallel Processors (MPPs) often show low efficiency in real-world applications because of cache miss penalty, insufficient throughput of the memory system, and poor inter-processor communication performance. Hitachi's SR2201, an MPP scalable up to 2048 processors and 600 GFLOPS peak performance, overcomes these problems by introducing three novel features. First, its proce...
A scalable register file architecture for dynamically scheduled processors
A major obstacle in designing dynamically scheduled processors is the size and port requirement of the register file. By using a multiple banked register file and performing dynamic result renaming, a scalable register file architecture can be implemented without performance degradation. In addition, a new hybrid register renaming technique to efficiently map the logical to physical registers a...
Journal title:
- Microprocessors and Microsystems - Embedded Hardware Design
Volume 33, issue
Pages -
Publication date 2009